Abstract. A recently identified problem is that of finding an optimal investment 
plan for a transportation network, given that a disaster such as an earthquake 
may destroy links in the network. The aim is to strengthen key links to preserve 
the expected network connectivity. A network based on the Istanbul highway 
system has thirty links and therefore a billion scenarios, but it has been estimated 
that sampling a million scenarios gives reasonable accuracy. In this paper we use 
symmetry reasoning to reduce the number of scenarios to a much smaller number, 
making sampling unnecessary. This result can be used to facilitate metaheuristic 
and exact approaches to the problem. 

1 Introduction 

We consider a known problem in pre-disaster planning: forming an investment plan 
for a transportation network, with the aim of facilitating rescue operations in the case 
of earthquakes. Multi-stage stochastic problems occur in many real-life situations and 
are often tackled by Stochastic Programming (SP) methods based on Integer or
Mathematical Programming. These methods are guaranteed to find an optimal solution,
but because of the complexity of the problems they may only be practicable for small
instances.

The use of metaheuristics such as Tabu Search, Simulated Annealing, Genetic Al- 
gorithms and Ant Colony Optimisation is another promising approach to such prob- 
lems. Though not guaranteed to find optimal solutions, metaheuristics can often find 
near-optimal solutions in a reasonable time. But applying metaheuristics requires the 
computation of an objective function (or fitness), which can be prohibitively expen- 
sive for stochastic problems. A common way of reducing the computational effort is 
approximation by scenario sampling. 

We propose an alternative method to sampling for the earthquake problem, which 
does not involve approximation but which makes the fitness computation tractable. The 
paper is organised as follows. Section 2 describes the problem and an existing approach
to solving it. Section 3 outlines a standard metaheuristic approach and points out its
impracticality. Section 4 describes our method. Section 5 concludes the paper.

2 A disaster pre-planning problem 

The problem is taken from Peeta et al. [12]. Consider the transportation network in
Figure 1 (not drawn to scale), with nodes numbered 1-25 and arcs numbered 1-30,
each of whose arcs (which we shall refer to as links) may fail with some probability.
This network is modelled on the Istanbul highway network.




Fig. 1. Istanbul road network 



The failure probability of a link can be reduced by investing money in it, and we 
have a budget limiting the total investment. We aim to minimise the expected shortest 
path between a specified source and sink node in the network. More generally, we aim 
to minimise a weighted sum of shortest path lengths between several source-sink pairs, 
chosen (for example) to represent paths between hospitals and areas of high population. 
This is an example of pre-disaster planning, where a decision maker aims to maximise 
the robustness of a transportation network with respect to possible disasters in order to 
facilitate rescue operations. 

First some notation. Represent the network as an undirected graph G = (V, E) with
nodes V and arcs or links E. For each link e ∈ E define a binary decision variable y_e
which is 1 if we invest in that link and 0 otherwise. Define a binary random variable r_e
which is 1 if link e survives and 0 if it fails. Denote the survival (non-failure)
probability of link e by p_e without investment and q_e with investment, the investment
required for link e by c_e, the length of link e by t_e, and the budget by B. If source
and sink are unconnected then the path length is taken to be a fixed number M
representing (for example) the cost of using a helicopter. Actually, if they are only
connected by long paths then they are considered to be unconnected, as in practice
rescuers would resort to alternatives such as rescue by helicopter or sea. So Peeta et al.
only consider a few shortest paths for each source-sink pair, as shown in Table 1. We
shall refer to these as the allowed paths. In each case M is chosen to be the smallest
integer that is greater than the longest allowed path length. They also consider a larger
value of M = 120 that places a greater importance on connectivity. Let us replace M by
2 new constants: M_a is the length below which a path is allowed, while M_p is the
penalty imposed when no allowed path exists.
Table 1. Allowed paths 



In SP terms this is a 2-stage problem. In the first stage we must decide which links 
to invest in, then link failures occur randomly. In the second stage we must choose a 
shortest path between the source and sink (the recourse action), given the surviving 
links. If the source and sink are no longer connected by an allowed path then a fixed 
penalty M_p is imposed. Peeta et al. point out that, though a natural approach is to 
strengthen the weakest links, this does not necessarily lead to the best results. 
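
For reference, the two-stage problem described above can be written compactly as follows. This is a sketch inferred from the verbal description, not a formulation quoted from Peeta et al.; the symbol \ell(r) is introduced here for illustration and stands for the shortest allowed path length among the surviving links, or M_p if no allowed path survives.

```latex
\begin{align*}
\min_{y \in \{0,1\}^{|E|}} \quad & \mathbb{E}_{r}\bigl[\ell(r)\bigr] \\
\text{subject to} \quad & \sum_{e \in E} c_e\, y_e \le B, \\
& \Pr(r_e = 1) = p_e + (q_e - p_e)\, y_e \qquad \forall e \in E .
\end{align*}
```

The last line makes explicit that the distribution of the random variables depends on the first-stage decisions.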

This is a challenging problem because each of the 30 links might independently be
affected by earthquakes, giving 2^30 scenarios. Though optimisation time is not critical
in pre-disaster planning, over 1 billion scenarios is too many to be tractable. Another
source of difficulty is that the problem has endogenous uncertainty: the decisions
(which links to invest in) affect the probabilities of the random events (the link
failures). Relatively little work has been done on such problems but they are usually
much harder to solve by SP methods. For a survey on problems with endogenous uncertainty
see Q, which mentions applications including network design and interdiction, server
selection, facility location, and gas reservoir development. Other examples include
clinical trial planning [4] and portfolio optimisation [15].

3 A metaheuristic approach 

Peeta et al. sample a million scenarios, and approximate the objective function by a
monotonic multilinear function. They show that their method gives optimal or
near-optimal results on smaller instances. We are interested in applying standard
metaheuristics such as genetic algorithms to this problem. As noted in a recent survey
of metaheuristic approaches to stochastic problems [1], most research is on continuous
problems and problems with noisy or time-varying fitness, and less work has been done on
metaheuristics for multi-stage problems.

An obvious approach to the earthquake problem is to use a population of chromosomes,
each with 30 binary genes corresponding to y_1 ... y_30. Thus each chromosome
is a direct representation of an investment plan in which values of 1 indicate investment
and 0 no investment. Standard genetic operators (selection, recombination and mutation)
can be applied to this model. To compute the fitness (objective function) of a
chromosome we check every scenario, each representing a network realisation. For a given
network realisation we compute the length of shortest paths between source-sink pairs
using (for example) Dijkstra's algorithm, taking value M_p if there is no allowed path.
From all the scenarios we can compute expected path lengths and hence the chromosome
fitness.

A slight complication is the budget constraint: a chromosome might contain too
many 1-values, corresponding to overspend. There are 3 ways of handling constraints
in genetic algorithms [5]:

- Penalise constraint violation by adding a penalty function to the fitness. 

- Repair the chromosome so that it no longer violates any constraints. 

- Use a decoder to generate a feasible solution from the chromosome, which is 
treated not as a solution but as a set of instructions on how to construct one. 

Any of these approaches can be applied to this problem, and we propose a simple
decoder: consider the genes in a fixed order (the numerical order of links 1-30) and
treat any 1-value that would violate the budget constraint as a 0-value. The endogenous
uncertainty is not a problem here: given an investment plan we can immediately deduce
the survival probability for each link, which enables us to compute each scenario
probability and hence the path length expectations.
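
The decoder just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not code from the original work: the function names `decode` and `survival_probabilities` and the list-based data layout are hypothetical.

```python
from typing import List, Sequence


def decode(chromosome: Sequence[int], costs: Sequence[float], budget: float) -> List[int]:
    """Scan genes in link order; keep a 1 only if that link still fits the budget."""
    plan, spent = [], 0.0
    for gene, cost in zip(chromosome, costs):
        if gene == 1 and spent + cost <= budget:
            plan.append(1)
            spent += cost
        else:
            # Either no investment was requested, or it would overspend.
            plan.append(0)
    return plan


def survival_probabilities(plan: Sequence[int], p: Sequence[float],
                           q: Sequence[float]) -> List[float]:
    """Survival probability per link: q_e if invested, p_e otherwise."""
    return [q_e if y_e else p_e for y_e, p_e, q_e in zip(plan, p, q)]


# Hypothetical 4-link instance with unit costs and a budget of 2.
plan = decode([1, 1, 0, 1], costs=[1, 1, 1, 1], budget=2)
probs = survival_probabilities(plan, p=[0.8] * 4, q=[1.0] * 4)
```

Because every chromosome decodes to a feasible plan, no penalty term or repair step is needed in this variant.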

Unfortunately, this straightforward approach is impractical because we must consider
a billion scenarios to compute the fitness of each chromosome. In fact a major
issue when solving stochastic problems by metaheuristics is the fitness computation,
and there are 3 common approaches [1]:

- Use a closed-form expression to compute exact fitness. 

- Use a fast approximation to an expensive closed-form expression. 

- Estimate fitness by sampling scenarios, as in the field of Simulation Optimisation. 

We propose an alternative approach: compute fitness exactly by exploiting symmetries 
between scenarios, in order to bundle many of them together so that they can be con- 
sidered simultaneously. 



4 Exploiting symmetries between scenarios 



We shall illustrate our method using the simple example in Figure 2. The links e =
1 ... 4 have lengths t_e = 1, p_e = 0.8, q_e = 1, c_e = 1, B = 1 and M_a = M_p = 3.5
so that both possible paths between nodes 1 and 4 are allowed. We must choose 1 link to
invest in, to minimise the expected shortest path length between nodes 1 and 4. There are
16 scenarios, and the optimal policy is to invest in link 1, giving an expected shortest
path length of 2.236.
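
The figures quoted for this example can be checked by brute force over all 16 scenarios. The sketch below is illustrative only. Since Figure 2 is not reproduced here, the topology is an assumption: the two allowed paths are taken to use links {1, 4} (length 2) and links {1, 2, 3} (length 3), a hypothetical layout chosen because it is consistent with the numbers in the text. All identifiers are made up for this sketch.

```python
from itertools import product

# ASSUMPTION: hypothetical topology for the small example (Figure 2 is not shown).
# Each allowed path is (set of links used, path length).
ALLOWED_PATHS = [({1, 4}, 2.0), ({1, 2, 3}, 3.0)]
LINKS = [1, 2, 3, 4]
P, Q, M_P = 0.8, 1.0, 3.5  # survival prob. without/with investment, penalty


def path_length(surviving):
    """Shortest allowed path whose links all survive, else the penalty M_p."""
    return min((t for links, t in ALLOWED_PATHS if links <= surviving), default=M_P)


def expected_length(invested):
    """Exact expectation by enumerating every survive/fail combination."""
    survive = {e: (Q if e in invested else P) for e in LINKS}
    total = 0.0
    for outcome in product([0, 1], repeat=len(LINKS)):
        prob = 1.0
        for e, s in zip(LINKS, outcome):
            prob *= survive[e] if s else 1.0 - survive[e]
        surviving = {e for e, s in zip(LINKS, outcome) if s}
        total += prob * path_length(surviving)
    return total


# Budget B = 1 with unit costs: exactly one link can be strengthened.
results = {e: expected_length({e}) for e in LINKS}
best = min(results, key=results.get)
```

Under this assumed topology, `best` is link 1 and `results[1]` agrees with the value 2.236 stated above.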




Fig. 2. A small network example 



4.1 Merging scenarios 

Some scenarios can be considered together instead of separately. For example consider
two scenarios 1001 (in which links 1 and 4 survive but 2 and 3 do not) and 1101
(identical except that link 2 survives). These scenarios have probabilities
0.8 × 0.2 × 0.2 × 0.8 = 0.0256 and 0.8 × 0.8 × 0.2 × 0.8 = 0.1024 respectively. As link 3
does not survive, it is irrelevant whether or not link 2 survives because it cannot be
part of a shortest path (or any path). We can therefore merge these two scenarios into
one, which we shall write as 1i01 where "i" denotes interchangeability: the values 0 and
1 for link 2 are interchangeable. We shall refer to a combined scenario such as 1i01 as a
multiscenario. As "i" includes both the 0 and 1 values, the probability associated with
the multiscenario is 0.8 × 1.0 × 0.2 × 0.8 = 0.128.

However, it is impractical to enumerate a billion scenarios then look for ways of
merging some of them. Instead suppose we enumerate scenarios by tree search on the
random variables. Consider a node in the tree at which links 1 ... i-1 have been realised,
so that random variables s_1 ... s_{i-1} have been assigned values, and we are about to
assign a value to s_i corresponding to link e. Denote by ℓ_e the shortest source-sink path
length including e, under the assumption that all unrealised links survive; and denote
by ℓ_ē the shortest source-sink path length not including e, under the assumption that
all unrealised links fail (using M_p when no path exists). So ℓ_e is the minimum shortest
path length including e in all scenarios below this policy tree node, while ℓ_ē is the
maximum shortest path length not including e in the same scenarios. They can easily be
computed by temporarily assigning s_i ... s_n (where |E| = n) to 1 or 0 respectively,
and applying a shortest path algorithm. Now if ℓ_e > ℓ_ē then the value assigned to s_i is
irrelevant: the shortest path length in each scenario under this tree node is independent
of the value of s_i, so the values are interchangeable.
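
The tree search with the interchangeability test can be sketched as follows. This is a minimal illustration, not the authors' implementation, and it rests on several assumptions: the small example is taken to have two allowed paths using links {1, 4} (length 2) and {1, 2, 3} (length 3), a hypothetical topology since Figure 2 is not reproduced here; allowed paths are stored explicitly as link sets rather than found by a shortest path algorithm; and the bound for paths through e is taken to be infinite when no such path remains. The names `multiscenarios` and `expected_length` are made up for this sketch.

```python
import math

# ASSUMPTION: hypothetical topology for the small example (see lead-in).
ALLOWED_PATHS = [({1, 4}, 2.0), ({1, 2, 3}, 3.0)]  # (links used, length)
M_P = 3.5  # penalty when no allowed path survives


def multiscenarios(order):
    """Enumerate multiscenarios by tree search over links in the given order.

    Each result maps link -> 0 (fails), 1 (survives) or "i" (interchangeable).
    """
    found = []

    def search(depth, assigned):
        if depth == len(order):
            found.append(assigned)
            return
        e = order[depth]
        failed = {k for k, v in assigned.items() if v == 0}
        survived = {k for k, v in assigned.items() if v == 1}
        # Lower bound: best path through e if every unrealised link survives.
        l_e = min((t for links, t in ALLOWED_PATHS
                   if e in links and not (links & failed)), default=math.inf)
        # Upper bound: best path avoiding e if every unrealised link fails.
        l_ebar = min((t for links, t in ALLOWED_PATHS
                      if e not in links and links <= survived), default=M_P)
        if l_e > l_ebar:
            search(depth + 1, {**assigned, e: "i"})  # value of e is irrelevant
        else:
            for value in (0, 1):
                search(depth + 1, {**assigned, e: value})

    search(0, {})
    return found


def expected_length(scenarios, survive):
    """Exact expected path length; "i" links contribute a probability factor of 1."""
    total = 0.0
    for scenario in scenarios:
        prob = 1.0
        for link, value in scenario.items():
            if value == 1:
                prob *= survive[link]
            elif value == 0:
                prob *= 1.0 - survive[link]
        alive = {link for link, value in scenario.items() if value == 1}
        total += prob * min((t for links, t in ALLOWED_PATHS if links <= alive),
                            default=M_P)
    return total


sizes = {order: len(multiscenarios(list(order)))
         for order in [(1, 2, 3, 4), (1, 4, 3, 2), (2, 3, 4, 1)]}
invest_in_link_1 = {1: 1.0, 2: 0.8, 3: 0.8, 4: 0.8}
value = expected_length(multiscenarios([1, 2, 3, 4]), invest_in_link_1)
```

Links marked "i" are treated as failed when the leaf path length is computed, which is safe because the test guarantees that no shortest path depends on them.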



It is important to note that the order in which we assign the s variables affects the
size of the multiscenario set. Three multiscenario sets for the example are shown in
Table 2, where p is the multiscenario probability, 0 and 1 are the values of the random
variables corresponding to the links, and "i" denotes interchangeable values: given the
assignments to the left, the objective value is independent of whether the random
variable is set to 0 or 1. The set of size 7 corresponds to the numerical link ordering
1234, the set of size 10 is the largest possible, and the set of size 5 is the smallest
possible. Having derived the multiscenarios, we can replace the "i" entries by arbitrary
values, for example 0.
Table 2. Three multiscenario sets for the small example 



4.2 Stochastic dominance, symmetry and network reliability 

The above ideas have parallels in several literatures. Firstly, one way of viewing this 
form of reasoning is as stochastic dominance [10], a concept from the Decision Theory 
literature: the objective function associated with one choice (0 or 1) is at least as good 
as with another choice (1 or 0). Because this holds in every scenario, it is the simplest 
form of stochastic dominance: statewise (or zeroth order) dominance. However, this 
is usually defined as a strict dominance by adding an extra condition: that one choice 
is strictly better than the other in at least one state (or scenario). In our case neither 
value is better so this is a weak dominance. In fact we have two values that each weakly 
dominate the other, a relationship that can be viewed as a symmetry: the tree is exactly 
the same whichever value we use for a link. But there does not seem to be an accepted 
term such as "stochastic symmetry" for this phenomenon. 

However, in the Constraint Programming and Artificial Intelligence literatures this 
type of symmetry is often used to reduce search tree sizes (for non-stochastic problems): 
see Chapter 10 of [14] for a survey of such techniques. Because the symmetry only 
occurs under certain assignments to some other variables, it is a conditional symmetry, 
the condition being that certain other assignments have occurred. And because it is a
symmetry on values in the domain of a variable it is also a value interchangeability, a
form of symmetry first investigated in [6] and since developed in many ways [8]. More
specifically, it is a form called full dynamic interchangeability [8, 13], the word dynamic
having a similar meaning to conditional here. Though interchangeability has been the
subject of considerable research, a drawback is that it does not seem to occur in many 
real applications 1121 1 II : we believe that it will occur more often in stochastic settings 
such as this one. 

The Network Reliability literature describes methods for evaluating and approximating
the reliability of a network. These include ways of pruning irrelevant parts of
a network that have connections to our approach, though we have not found a direct
parallel. For a discussion of these ideas see [3].

4.3 Application to the earthquake problem: single path 

For the earthquake problem we used a simple hill-climbing algorithm to find a good
permutation, based on 2- and 3-exchange moves, and accepting moves that improve or
leave unchanged the number of multiscenarios. The results are given in Table 3 for each
source-sink pair considered separately, and took several minutes each to compute. The
table shows the instances numbered 1-5, the source and sink, the chosen constant M_a,
the best link permutation found, and the size of the corresponding multiscenario set.
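
A hill-climber of this kind might look like the sketch below. It is a hedged illustration rather than the procedure actually used: it assumes that a 2-exchange swaps two positions and a 3-exchange rotates three positions, and it takes the multiscenario counter as a caller-supplied function `size_of`. The scoring function `toy_size` is a hypothetical stand-in (an inversion count) used only so that the example runs on its own.

```python
import random


def hill_climb(order, size_of, iterations=200, seed=0):
    """Local search over link permutations using 2- and 3-exchange moves.

    Moves that improve the size or leave it unchanged are accepted.
    """
    rng = random.Random(seed)
    best = list(order)
    best_size = size_of(best)
    for _ in range(iterations):
        candidate = best[:]
        if rng.random() < 0.5:
            # 2-exchange (assumed): swap the links at two positions.
            i, j = rng.sample(range(len(candidate)), 2)
            candidate[i], candidate[j] = candidate[j], candidate[i]
        else:
            # 3-exchange (assumed): rotate the links at three positions.
            i, j, k = rng.sample(range(len(candidate)), 3)
            candidate[i], candidate[j], candidate[k] = (
                candidate[k], candidate[i], candidate[j])
        size = size_of(candidate)
        if size <= best_size:  # accept improving and sideways moves
            best, best_size = candidate, size
    return best, best_size


def toy_size(order):
    """Hypothetical stand-in for the multiscenario count: 1 + number of inversions."""
    n = len(order)
    return 1 + sum(1 for a in range(n) for b in range(a + 1, n) if order[a] > order[b])


start = [5, 3, 1, 4, 2]
best, best_size = hill_climb(start, toy_size)
```

In practice `size_of` would run the tree search for the given ordering and return the number of leaves, which is the expensive step.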



instance  pair   link permutation                              multiscenarios
1         14-20  20 22 2 21 3 17 18 6 25 15 1 23 4 24 11       69
                 14 26 30 13 19 12 10 29 28 7 16 27 5 9 8
2         14-7   20 16 10 25 17 14 1 28 9 2 13 3 8 11 4        45
                 30 12 29 26 21 15 5 6 24 23 7 18 19 22 27
3         12-18  18 22 23 9 21 20 2 12 24 8 7 11 26 17 13      79
                 3 6 27 14 16 10 28 1 19 15 29 4 30 5 25
4         9-7    1 3 28 29 10 13 20 18 8 11 9 27 12 16 17      26
                 21 30 19 25 24 2 4 14 22 5 23 26 7 6 15
5         4-8    6 12 24 5 8 18 4 19 9 3 21 23 28 7 10         124
                 29 11 20 13 30 17 2 26 16 22 15 27 14 1 25

Table 3. Scenario reduction results for the earthquake problem 



Note that the choice of permutation has a significant effect. For the 5 instances, if
we use the numerical link ordering we obtain multiscenario sets of sizes 4944, 4154,
5268, 87 and 1488 respectively, but we might be more unlucky: we sampled 10 random
permutations per instance and found worst-case multiscenario set sizes 31124, 115760,
21200, 994 and 7408 respectively. These are much better than 1 billion but considerably
worse than the best sets we found, showing the advantage of searching for a good
permutation.

Table 4 shows in full the multiscenario set for the 4th instance, which has the smallest
set. The table shows that the expected shortest path length from 9-7 is independent
of the survival or failure of links 1, 2, 3, 4, 5, 8, 15, 18, 19, 20, 21, 22, 23, 24, 25, 26,
27, 28, 29 and 30; it depends only on the remaining 10 links 6, 7, 9, 10, 11, 12, 13,
14, 16 and 17. Therefore the multiscenario set for this instance should be no larger than
2^10 = 1024. But among these 10 links there are interchangeabilities: for example if link
10 fails then the expected length is independent of the survival or failure of link 13, but
if link 10 survives then what happens to link 13 is important: if link 13 also survives
(last line) then no other link matters because the 9-7 shortest path is available.
Table 4. Multiscenario set for instance 4 



4.4 Extension to multiple paths 

We aim to minimise the expected weighted sum of shortest path lengths between
several source-sink pairs:

    Minimise z = E[ Σ_i w_i ℓ_i ]

for weights w_i, where ℓ_i is the shortest path length for pair i. Unfortunately, there
is likely to be little interchangeability in this problem, especially if (as we would
expect) the pairs are chosen to cover most of the network: for a given link to be
irrelevant to the lengths of several paths is much less likely than for one path. But we
can avoid this drawback by using the linearity of expectation to rewrite the objective
function as:

    Minimise z = Σ_i w_i E[ ℓ_i ]

so that each path is treated separately, and can be evaluated using its own link
permutation. If we do this using the permutations shown in Table 3 the total number of
multiscenarios is 343, so to evaluate an investment plan we need consider only this
many multiscenarios: a very tractable number. Note that we compute fitness exactly, so
the optimal investment plan is guaranteed to occur in the search space of the genetic
algorithm (though the algorithm is not guaranteed to find it).
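
The per-pair evaluation can be sketched as follows. This is an illustrative sketch only: the data layout (each pair given as a weight and a list of multiscenarios, each multiscenario as a mapping from the links it fixes to 0 or 1 together with its path length) and the tiny two-pair instance are hypothetical, and interchangeable links are simply left out of the mapping.

```python
def multiscenario_probability(assignment, survive):
    """Product of survival or failure probabilities over the links that are fixed."""
    prob = 1.0
    for link, value in assignment.items():
        prob *= survive[link] if value == 1 else 1.0 - survive[link]
    return prob


def expected_weighted_length(pairs, survive):
    """Weighted sum over pairs of the expected path length of each pair.

    pairs: list of (weight, [(assignment, path_length), ...]).
    """
    return sum(
        weight * sum(multiscenario_probability(a, survive) * length
                     for a, length in scenarios)
        for weight, scenarios in pairs
    )


# Hypothetical instance: pair A depends only on link 1, pair B only on link 2.
pairs = [
    (1.0, [({1: 1}, 2.0), ({1: 0}, 5.0)]),
    (2.0, [({2: 1}, 1.0), ({2: 0}, 4.0)]),
]
survive = {1: 0.8, 2: 0.5}  # survival probabilities implied by some investment plan
z = expected_weighted_length(pairs, survive)
```

The multiscenario lists depend only on the network, so they can be computed once and reused for every investment plan; only `survive` changes between fitness evaluations.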

4.5 Other risk measures 

Note that it is easy to change the objective function in a genetic algorithm. SP re- 
searchers have recently explored risk-averse disaster planning including transportation 
networks [10], and we can use risk-averse objective functions such as conditional value- 
at-risk (CVaR) for a single source-sink pair. For multiple pairs we can take a weighted 
sum of CVaRs. For our method to work well, we must optimise some function of sta- 
tistical parameters computed on each pair separately. 
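
As an illustration of a per-pair risk measure, the sketch below computes CVaR for a discrete distribution of path lengths. It assumes the common definition of CVaR at level alpha as the mean of the worst (1 - alpha) share of probability mass, which may differ in detail from the convention used in the cited work. The example distribution is hypothetical and the function names are made up.

```python
def cvar(distribution, alpha):
    """Mean of the worst (1 - alpha) probability mass of a discrete loss distribution.

    distribution: iterable of (probability, loss) pairs with probabilities summing to 1.
    alpha: confidence level, 0 <= alpha < 1.
    """
    tail = 1.0 - alpha
    remaining = tail
    total = 0.0
    # Walk from the largest loss downwards until the tail mass is used up.
    for prob, loss in sorted(distribution, key=lambda pl: pl[1], reverse=True):
        take = min(prob, remaining)  # split the atom that straddles the quantile
        total += take * loss
        remaining -= take
        if remaining <= 1e-12:
            break
    return total / tail


def weighted_cvar(pairs, alpha):
    """Weighted sum of per-pair CVaRs; pairs is a list of (weight, distribution)."""
    return sum(weight * cvar(dist, alpha) for weight, dist in pairs)


# Hypothetical path length distribution for one source-sink pair.
lengths = [(0.8, 2.0), (0.128, 3.0), (0.072, 3.5)]
risk = cvar(lengths, 0.9)
mean = cvar(lengths, 0.0)  # alpha = 0 reduces to the plain expectation
```

Each distribution can be read off the multiscenario set of the corresponding pair, so the same reduction applies unchanged.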

5 Conclusion 

In future work we shall implement metaheuristic and exact algorithms to solve the re- 
duced problem. We shall also apply the reduction technique to other problems. 